Overview

Dataset statistics

Number of variables: 25
Number of observations: 138556
Missing cells: 137135
Missing cells (%): 4.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 26.4 MiB
Average record size in memory: 200.0 B
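
The missing-cell percentage follows from the counts above: the table holds 25 × 138556 cells, of which 137135 are missing. A minimal Python sketch of that arithmetic, with illustrative variable names that are not part of the report:

```python
# Figures copied from the dataset statistics above.
n_variables = 25
n_observations = 138556
missing_cells = 137135

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of cells that are missing

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

According to the warnings, all 137135 missing cells belong to the DOD column, which is why the column-level missing rate (99.0%) is far higher than the table-level one.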

Variable types

CAT: 17
NUM: 8

Warnings

DOB has a high cardinality: 900 distinct values (High cardinality)
DOD has 137135 (99.0%) missing values (Missing)
BID has unique values (Unique)
County has 2977 (2.1%) zeros (Zeros)
InpatientAnnualReimbursementAmt has 102511 (74.0%) zeros (Zeros)
InpatientAnnualDeductibleAmt has 102019 (73.6%) zeros (Zeros)
OutpatientAnnualReimbursementAmt has 4205 (3.0%) zeros (Zeros)
OutpatientAnnualDeductibleAmt has 13890 (10.0%) zeros (Zeros)

Reproduction

Analysis started: 2020-10-13 16:28:03.353280
Analysis finished: 2020-10-13 16:28:43.920685
Duration: 40.57 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

BID
Categorical

UNIQUE

Distinct: 138556
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value                  Count   Frequency (%)
BENE147419             1       < 0.1%
BENE131588             1       < 0.1%
BENE156585             1       < 0.1%
BENE134117             1       < 0.1%
BENE49085              1       < 0.1%
BENE151525             1       < 0.1%
BENE156611             1       < 0.1%
BENE45480              1       < 0.1%
BENE38765              1       < 0.1%
BENE93365              1       < 0.1%
Other values (138546)  138546  > 99.9%

Figure: Frequencies of value counts

Unique

Unique: 138556
Unique (%): 100.0%

Figure: Histogram of lengths of the category

Length

Max length: 10
Median length: 9
Mean length: 9.39984555
Min length: 9

DOB
Categorical

HIGH CARDINALITY

Distinct: 900
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value               Count   Frequency (%)
1939-10-01          540     0.4%
1941-10-01          538     0.4%
1939-03-01          535     0.4%
1940-03-01          526     0.4%
1939-04-01          517     0.4%
1941-05-01          513     0.4%
1943-12-01          512     0.4%
1941-12-01          512     0.4%
1942-12-01          509     0.4%
1943-11-01          509     0.4%
Other values (890)  133345  96.2%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

DOD
Categorical

MISSING

Distinct: 11
Distinct (%): 0.8%
Missing: 137135
Missing (%): 99.0%
Memory size: 1.1 MiB

Value       Count   Frequency (%)
2009-12-01  182     0.1%
2009-10-01  168     0.1%
2009-09-01  164     0.1%
2009-11-01  149     0.1%
2009-08-01  144     0.1%
2009-07-01  141     0.1%
2009-05-01  119     0.1%
2009-06-01  119     0.1%
2009-04-01  94      0.1%
2009-03-01  91      0.1%
(Missing)   137135  99.0%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 10
Median length: 3
Mean length: 3.071790467
Min length: 3
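
The length statistics for DOD look odd for a date column (median 3, mean about 3.07). One plausible reading, offered here as an assumption rather than something the report states, is that each missing value was measured as the 3-character string "nan" while each recorded date has 10 characters. The sketch below checks that arithmetic against the counts reported for this variable.

```python
n_observations = 138556
n_missing = 137135                       # DOD cells with no value
n_present = n_observations - n_missing   # recorded dates of death

len_missing = len("nan")        # assumption: missing values counted as the string "nan"
len_date = len("2009-12-01")    # 10 characters per recorded date

# Weighted average of the two possible string lengths.
mean_length = (n_missing * len_missing + n_present * len_date) / n_observations
print(n_present, round(mean_length, 6))
```

Under that assumption the result agrees with the reported mean length, and the median of 3 follows because 99.0% of the values are missing.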

Gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value  Count  Frequency (%)
2      79106  57.1%
1      59450  42.9%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Race
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value  Count   Frequency (%)
1      117057  84.5%
2      13538   9.8%
3      5059    3.7%
5      2902    2.1%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

RenalDisease
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value  Count   Frequency (%)
0      118978  85.9%
Y      19578   14.1%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

State
Real number (ℝ≥0)

Distinct: 52
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.66673403
Minimum: 1
Maximum: 54
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 11
median: 25
Q3: 39
95-th percentile: 50
Maximum: 54
Range: 53
Interquartile range (IQR): 28

Descriptive statistics

Standard deviation: 15.22344304
Coefficient of variation (CV): 0.5931196007
Kurtosis: -1.249185339
Mean: 25.66673403
Median Absolute Deviation (MAD): 14
Skewness: 0.08072830774
Sum: 3556280
Variance: 231.7532179
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
5                  12052  8.7%
10                 9771   7.1%
45                 8780   6.3%
33                 8443   6.1%
39                 6055   4.4%
14                 5923   4.3%
36                 5366   3.9%
23                 5293   3.8%
34                 4629   3.3%
31                 4124   3.0%
Other values (42)  68120  49.2%

Minimum 5 values

Value  Count  Frequency (%)
1      2615   1.9%
2      196    0.1%
3      2395   1.7%
4      1817   1.3%
5      12052  8.7%

Maximum 5 values

Value  Count  Frequency (%)
54     1237   0.9%
53     295    0.2%
52     2662   1.9%
51     1212   0.9%
50     2793   2.0%


County
Real number (ℝ≥0)

ZEROS

Distinct: 314
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 374.4247452
Minimum: 0
Maximum: 999
Zeros: 2977
Zeros (%): 2.1%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 20
Q1: 141
median: 340
Q3: 570
95-th percentile: 881
Maximum: 999
Range: 999
Interquartile range (IQR): 429

Descriptive statistics

Standard deviation: 266.2775811
Coefficient of variation (CV): 0.711164485
Kurtosis: -0.7522266009
Mean: 374.4247452
Median Absolute Deviation (MAD): 200
Skewness: 0.4671250956
Sum: 51878795
Variance: 70903.75021
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common values

Value               Count   Frequency (%)
200                 3943    2.8%
10                  3587    2.6%
20                  3176    2.3%
60                  3003    2.2%
0                   2977    2.1%
90                  2833    2.0%
470                 2768    2.0%
400                 2738    2.0%
160                 2526    1.8%
150                 2411    1.7%
Other values (304)  108594  78.4%

Minimum 5 values

Value  Count  Frequency (%)
0      2977   2.1%
1      3      < 0.1%
10     3587   2.6%
11     64     < 0.1%
14     2      < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
999    264    0.2%
996    16     < 0.1%
994    25     < 0.1%
993    17     < 0.1%
992    52     < 0.1%


NumOfMonths_PartACov
Real number (ℝ≥0)

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.90772684
Minimum: 0
Maximum: 12
Zeros: 1000
Zeros (%): 0.7%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 12
Q1: 12
median: 12
Q3: 12
95-th percentile: 12
Maximum: 12
Range: 12
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.032331747
Coefficient of variation (CV): 0.08669427511
Kurtosis: 126.3832027
Mean: 11.90772684
Median Absolute Deviation (MAD): 0
Skewness: -11.29222571
Sum: 1649887
Variance: 1.065708835
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=13)

Common values

Value             Count   Frequency (%)
12                137389  99.2%
0                 1000    0.7%
6                 38      < 0.1%
11                28      < 0.1%
8                 26      < 0.1%
10                18      < 0.1%
7                 16      < 0.1%
4                 13      < 0.1%
5                 8       < 0.1%
9                 7       < 0.1%
Other values (3)  13      < 0.1%

Minimum 5 values

Value  Count  Frequency (%)
0      1000   0.7%
1      3      < 0.1%
2      5      < 0.1%
3      5      < 0.1%
4      13     < 0.1%

Maximum 5 values

Value  Count   Frequency (%)
12     137389  99.2%
11     28      < 0.1%
10     18      < 0.1%
9      7       < 0.1%
8      26      < 0.1%


NumOfMonths_PartBCov
Real number (ℝ≥0)

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.91014463
Minimum: 0
Maximum: 12
Zeros: 675
Zeros (%): 0.5%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 12
Q1: 12
median: 12
Q3: 12
95-th percentile: 12
Maximum: 12
Range: 12
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.9368933355
Coefficient of variation (CV): 0.07866347254
Kurtosis: 135.9775469
Mean: 11.91014463
Median Absolute Deviation (MAD): 0
Skewness: -11.4782554
Sum: 1650222
Variance: 0.877769122
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=13)

Common values

Value             Count   Frequency (%)
12                136902  98.8%
0                 675     0.5%
6                 282     0.2%
10                150     0.1%
11                143     0.1%
9                 122     0.1%
8                 71      0.1%
7                 63      < 0.1%
5                 50      < 0.1%
4                 35      < 0.1%
Other values (3)  63      < 0.1%

Minimum 5 values

Value  Count  Frequency (%)
0      675    0.5%
1      17     < 0.1%
2      19     < 0.1%
3      27     < 0.1%
4      35     < 0.1%

Maximum 5 values

Value  Count   Frequency (%)
12     136902  98.8%
11     143     0.1%
10     150     0.1%
9      122     0.1%
8      71      0.1%
 
Chronic_Alzheimer
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value  Count  Frequency (%)
2      92530  66.8%
1      46026  33.2%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Chronic_Heartfailure
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value  Count  Frequency (%)
2      70154  50.6%
1      68402  49.4%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Chronic_KidneyDisease
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value  Count  Frequency (%)
2      95277  68.8%
1      43279  31.2%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Chronic_Cancer
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value  Count   Frequency (%)
2      121935  88.0%
1      16621   12.0%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Chronic_ObstrPulmonary
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value  Count   Frequency (%)
2      105697  76.3%
1      32859   23.7%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Chronic_Depression
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value  Count  Frequency (%)
2      89296  64.4%
1      49260  35.6%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Chronic_Diabetes
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value  Count  Frequency (%)
1      83391  60.2%
2      55165  39.8%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Chronic_IschemicHeart
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value  Count  Frequency (%)
1      93644  67.6%
2      44912  32.4%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Chronic_Osteoporasis
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value  Count   Frequency (%)
2      100497  72.5%
1      38059   27.5%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Chronic_rheumatoidarthritis
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value  Count   Frequency (%)
2      102972  74.3%
1      35584   25.7%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Chronic_stroke
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value  Count   Frequency (%)
2      127602  92.1%
1      10954   7.9%

Figure: Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Figure: Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

InpatientAnnualReimbursementAmt
Real number (ℝ)

ZEROS

Distinct: 3004
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3660.346502
Minimum: -8000
Maximum: 161470
Zeros: 102511
Zeros (%): 74.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: -8000
5-th percentile: 0
Q1: 0
median: 0
Q3: 2280
95-th percentile: 20260
Maximum: 161470
Range: 169470
Interquartile range (IQR): 2280

Descriptive statistics

Standard deviation: 9568.621827
Coefficient of variation (CV): 2.614130061
Kurtosis: 31.02521782
Mean: 3660.346502
Median Absolute Deviation (MAD): 0
Skewness: 4.636541731
Sum: 507162970
Variance: 91558523.67
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
0                    102511  74.0%
4000                 2123    1.5%
5000                 1851    1.3%
3000                 1800    1.3%
6000                 1589    1.1%
7000                 1274    0.9%
8000                 1230    0.9%
9000                 1105    0.8%
10000                1085    0.8%
11000                1018    0.7%
Other values (2994)  22970   16.6%

Minimum 5 values

Value  Count  Frequency (%)
-8000  1      < 0.1%
-1400  1      < 0.1%
-1000  1      < 0.1%
-640   1      < 0.1%
-500   1      < 0.1%

Maximum 5 values

Value   Count  Frequency (%)
161470  1      < 0.1%
155600  1      < 0.1%
155270  1      < 0.1%
153580  1      < 0.1%
148580  1      < 0.1%


InpatientAnnualDeductibleAmt
Real number (ℝ≥0)

ZEROS

Distinct: 147
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 399.8472964
Minimum: 0
Maximum: 38272
Zeros: 102019
Zeros (%): 73.6%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1068
95-th percentile: 2136
Maximum: 38272
Range: 38272
Interquartile range (IQR): 1068

Descriptive statistics

Standard deviation: 956.1752023
Coefficient of variation (CV): 2.391350926
Kurtosis: 268.1092213
Mean: 399.8472964
Median Absolute Deviation (MAD): 0
Skewness: 10.45326205
Sum: 55401242
Variance: 914271.0176
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common values

Value               Count   Frequency (%)
0                   102019  73.6%
1068                27113   19.6%
2136                6418    4.6%
3204                1481    1.1%
4272                369     0.3%
3068                98      0.1%
2068                87      0.1%
5340                79      0.1%
4068                67      < 0.1%
5068                59      < 0.1%
Other values (137)  766     0.6%

Minimum 5 values

Value  Count   Frequency (%)
0      102019  73.6%
1068   27113   19.6%
1088   2       < 0.1%
1098   4       < 0.1%
1118   1       < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
38272  1      < 0.1%
37204  1      < 0.1%
36136  3      < 0.1%
35204  1      < 0.1%
35068  4      < 0.1%


OutpatientAnnualReimbursementAmt
Real number (ℝ)

ZEROS

Distinct: 2078
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1298.219348
Minimum: -70
Maximum: 102960
Zeros: 4205
Zeros (%): 3.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: -70
5-th percentile: 20
Q1: 170
median: 570
Q3: 1500
95-th percentile: 4370
Maximum: 102960
Range: 103030
Interquartile range (IQR): 1330

Descriptive statistics

Standard deviation: 2493.901134
Coefficient of variation (CV): 1.921016766
Kurtosis: 159.619928
Mean: 1298.219348
Median Absolute Deviation (MAD): 480
Skewness: 8.606026117
Sum: 179876080
Variance: 6219542.865
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
0                    4205    3.0%
100                  3916    2.8%
200                  3153    2.3%
60                   2694    1.9%
300                  2280    1.6%
90                   2178    1.6%
80                   2087    1.5%
50                   2086    1.5%
40                   2048    1.5%
400                  2045    1.5%
Other values (2068)  111864  80.7%

Minimum 5 values

Value  Count  Frequency (%)
-70    1      < 0.1%
-60    3      < 0.1%
-50    2      < 0.1%
-40    1      < 0.1%
-20    2      < 0.1%

Maximum 5 values

Value   Count  Frequency (%)
102960  1      < 0.1%
101250  1      < 0.1%
97510   1      < 0.1%
94910   1      < 0.1%
86980   1      < 0.1%


OutpatientAnnualDeductibleAmt
Real number (ℝ≥0)

ZEROS

Distinct: 789
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 377.7182583
Minimum: 0
Maximum: 13840
Zeros: 13890
Zeros (%): 10.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 40
median: 170
Q3: 460
95-th percentile: 1340
Maximum: 13840
Range: 13840
Interquartile range (IQR): 420

Descriptive statistics

Standard deviation: 645.5301866
Coefficient of variation (CV): 1.709025636
Kurtosis: 47.74698455
Mean: 377.7182583
Median Absolute Deviation (MAD): 150
Skewness: 5.435348431
Sum: 52335131
Variance: 416709.2218
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   13890  10.0%
20                  7271   5.2%
10                  6140   4.4%
30                  4755   3.4%
100                 4743   3.4%
40                  4312   3.1%
200                 4043   2.9%
50                  3577   2.6%
60                  3266   2.4%
70                  3081   2.2%
Other values (779)  83478  60.2%

Minimum 5 values

Value  Count  Frequency (%)
0      13890  10.0%
10     6140   4.4%
20     7271   5.2%
30     4755   3.4%
40     4312   3.1%

Maximum 5 values

Value  Count  Frequency (%)
13840  1      < 0.1%
13040  1      < 0.1%
12090  1      < 0.1%
11800  1      < 0.1%
11570  1      < 0.1%


Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
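
The calculation described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the code the report itself ran; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)  # undefined if either variable is constant

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(x, y)

# Shifting and positively rescaling either variable leaves r unchanged.
r_shifted = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_shifted)
```

Both calls give the same coefficient, which illustrates the invariance to location and scale. Rescaling by a negative factor would flip the sign of r.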

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
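
A minimal sketch of that procedure, assuming ties are given the average of their ranks (a common convention that the text above does not spell out). The names `average_ranks` and `spearman_rho` are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation applied to the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)  # the 1/n factors cancel

x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]  # nonlinear but strictly increasing
rho = spearman_rho(x, y)
print(rho)
```

Because y increases whenever x does, the ranks of the two variables coincide and ρ reaches its maximum even though the relationship is not linear.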

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
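
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition; it is an assumption-free reading of the text but not necessarily what the report computed, since statistical libraries commonly use a tie-adjusted variant (tau-b). The function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # pairs tied in x or y count as neither
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)
print(tau)
```

In this example 8 of the 10 pairs are concordant and 2 are discordant, so τ = (8 - 2) / 10.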

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for the phik package.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma (2013).
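
As a rough illustration, the bias-corrected estimator can be written in plain Python from a contingency table of counts. This is a sketch of the correction as it is usually stated, not the report's own implementation; the function name and the example tables are assumptions.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrections: shrink phi^2 and the table dimensions by their expected bias.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

independent = [[20, 30], [40, 60]]   # rows are proportional: no association
perfect = [[50, 0], [0, 50]]         # each row determines the column
v_independent = cramers_v_corrected(independent)
v_perfect = cramers_v_corrected(perfect)
print(v_independent, v_perfect)
```

The two example tables sit at the ends of the scale: proportional rows give 0, and a table where each row maps to a single column gives a value of 1 up to floating-point rounding.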

Missing values


Sample

First rows

BID, DOB, DOD, Gender, Race, RenalDisease, State, County, NumOfMonths_PartACov, NumOfMonths_PartBCov, Chronic_Alzheimer, Chronic_Heartfailure, Chronic_KidneyDisease, Chronic_Cancer, Chronic_ObstrPulmonary, Chronic_Depression, Chronic_Diabetes, Chronic_IschemicHeart, Chronic_Osteoporasis, Chronic_rheumatoidarthritis, Chronic_stroke, InpatientAnnualReimbursementAmt, InpatientAnnualDeductibleAmt, OutpatientAnnualReimbursementAmt, OutpatientAnnualDeductibleAmt
0BENE110011943-01-01NaN110392301212121221112113600032046070
1BENE110021936-09-01NaN21039280121222222222222003050
2BENE110031936-08-01NaN11052590121212222221222009040
3BENE110041922-07-01NaN11039270121211222211112001810760
4BENE110051935-09-01NaN110246801212222212122220017901200
5BENE110061976-09-01NaN21023810121222222222222005000
6BENE110071940-09-012009-12-0112045610121211222212112001490160
7BENE110081934-02-01NaN2101514012122222221222200300
8BENE110091929-06-01NaN11Y44230121221222212222001000
9BENE110101936-07-01NaN2104130121221211211122001170660

Last rows

BID, DOB, DOD, Gender, Race, RenalDisease, State, County, NumOfMonths_PartACov, NumOfMonths_PartBCov, Chronic_Alzheimer, Chronic_Heartfailure, Chronic_KidneyDisease, Chronic_Cancer, Chronic_ObstrPulmonary, Chronic_Depression, Chronic_Diabetes, Chronic_IschemicHeart, Chronic_Osteoporasis, Chronic_rheumatoidarthritis, Chronic_stroke, InpatientAnnualReimbursementAmt, InpatientAnnualDeductibleAmt, OutpatientAnnualReimbursementAmt, OutpatientAnnualDeductibleAmt
138546BENE1591881938-10-01NaN11Y31150121221212211222158001068614060
138547BENE1591891941-04-01NaN2101836112122221222112200182040
138548BENE1591901939-11-01NaN1102910121221122211221001600
138549BENE1591911926-10-01NaN1103671012121222122112100640250
138550BENE1591921937-04-01NaN210213012122222221121200420100
138551BENE1591941939-07-01NaN1103914012121222222222200430460
138552BENE1591951938-12-01NaN2104953012121222221222200880100
138553BENE1591961916-06-01NaN21061501212211121112222000106832401390
138554BENE1591971930-01-01NaN1101656012121122222122200265010
138555BENE1591981952-04-01NaN21021201212112221122120054701870